Goto

Collaborating Authors

 max norm



Matrix reconstruction with the local max norm Nathan Srebro Department of Statistics Toyota Technological Institute at Chicago Stanford University

Neural Information Processing Systems

We introduce a new family of matrix norms, the "local max" norms, generalizing existing methods such as the max norm, the trace norm (nuclear norm), and the weighted or smoothed weighted trace norms, which have been extensively used in the literature as regularizers for matrix reconstruction problems. We show that this new family can be used to interpolate between the (weighted or unweighted) trace norm and the more conservative max norm. We test this interpolation on simulated data and on the large-scale Netflix and MovieLens ratings data, and find improved accuracy relative to the existing matrix norms. We also provide theoretical results showing learning guarantees for some of the new norms.


Matrix reconstruction with the local max norm

Neural Information Processing Systems

We introduce a new family of matrix norms, the ''local max'' norms, generalizing existing methods such as the max norm, the trace norm (nuclear norm), and the weighted or smoothed weighted trace norms, which have been extensively used in the literature as regularizers for matrix reconstruction problems. We show that this new family can be used to interpolate between the (weighted or unweighted) trace norm and the more conservative max norm. We test this interpolation on simulated data and on the large-scale Netflix and MovieLens ratings data, and find improved accuracy relative to the existing matrix norms. We also provide theoretical results showing learning guarantees for some of the new norms.


Towards Deterministic Diverse Subset Sampling

Schreurs, Joachim, Fanuel, Michaël, Suykens, Johan A. K.

arXiv.org Machine Learning

Determinantal point processes (DPPs) are well known models for diverse subset selection problems, including recommendation tasks, document summarization and image search. In this paper, we discuss a greedy deterministic adaptation of k-DPP. Deterministic algorithms are interesting for many applications, as they provide interpretability to the user by having no failure probability and always returning the same results. First, the ability of the method to yield low-rank approximations of kernel matrices is evaluated by comparing the accuracy of the Nystr\"om approximation on multiple datasets. Afterwards, we demonstrate the usefulness of the model on an image search task.


Optimal ANN-SNN Conversion for Fast and Accurate Inference in Deep Spiking Neural Networks

Ding, Jianhao, Yu, Zhaofei, Tian, Yonghong, Huang, Tiejun

arXiv.org Artificial Intelligence

Spiking Neural Networks (SNNs), as bio-inspired energy-efficient neural networks, have attracted great attentions from researchers and industry. The most efficient way to train deep SNNs is through ANN-SNN conversion. However, the conversion usually suffers from accuracy loss and long inference time, which impede the practical application of SNN. In this paper, we theoretically analyze ANN-SNN conversion and derive sufficient conditions of the optimal conversion. To better correlate ANN-SNN and get greater accuracy, we propose Rate Norm Layer to replace the ReLU activation function in source ANN training, enabling direct conversion from a trained ANN to an SNN. Moreover, we propose an optimal fit curve to quantify the fit between the activation value of source ANN and the actual firing rate of target SNN. We show that the inference time can be reduced by optimizing the upper bound of the fit curve in the revised ANN to achieve fast inference. Our theory can explain the existing work on fast reasoning and get better results. The experimental results show that the proposed method achieves near loss less conversion with VGG-16, PreActResNet-18, and deeper structures. Moreover, it can reach 8.6x faster reasoning performance under 0.265x energy consumption of the typical method. The code is available at https://github.com/DingJianhao/OptSNNConvertion-RNL-RIL.


Robust Matrix Completion with Mixed Data Types

Sun, Daqian, Wells, Martin T.

arXiv.org Machine Learning

We consider the matrix completion problem of recovering a structured low rank matrix with partially observed entries with mixed data types. Vast majority of the solutions have proposed computationally feasible estimators with strong statistical guarantees for the case where the underlying distribution of data in the matrix is continuous. A few recent approaches have extended using similar ideas these estimators to the case where the underlying distributions belongs to the exponential family. Most of these approaches assume that there is only one underlying distribution and the low rank constraint is regularized by the matrix Schatten Norm. We propose a computationally feasible statistical approach with strong recovery guarantees along with an algorithmic framework suited for parallelization to recover a low rank matrix with partially observed entries for mixed data types in one step. We also provide extensive simulation evidence that corroborate our theoretical results.


Matrix reconstruction with the local max norm

Foygel, Rina, Srebro, Nathan, Salakhutdinov, Russ R.

Neural Information Processing Systems

We introduce a new family of matrix norms, the ''local max'' norms, generalizing existing methods such as the max norm, the trace norm (nuclear norm), and the weighted or smoothed weighted trace norms, which have been extensively used in the literature as regularizers for matrix reconstruction problems. We show that this new family can be used to interpolate between the (weighted or unweighted) trace norm and the more conservative max norm. We test this interpolation on simulated data and on the large-scale Netflix and MovieLens ratings data, and find improved accuracy relative to the existing matrix norms. We also provide theoretical results showing learning guarantees for some of the new norms. Papers published at the Neural Information Processing Systems Conference.


Factor Group-Sparse Regularization for Efficient Low-Rank Matrix Recovery

Fan, Jicong, Ding, Lijun, Chen, Yudong, Udell, Madeleine

arXiv.org Machine Learning

This paper develops a new class of nonconvex regularizers for low-rank matrix recovery. Many regularizers are motivated as convex relaxations of the matrix rank function. Our new factor group-sparse regularizers are motivated as a relaxation of the number of nonzero columns in a factorization of the matrix. These nonconvex regularizers are sharper than the nuclear norm; indeed, we show they are related to Schatten-$p$ norms with arbitrarily small $0 < p \leq 1$. Moreover, these factor group-sparse regularizers can be written in a factored form that enables efficient and effective nonconvex optimization; notably, the method does not use singular value decomposition. We provide generalization error bounds for low-rank matrix completion which show improved upper bounds for Schatten-$p$ norm reglarization as $p$ decreases. Compared to the max norm and the factored formulation of the nuclear norm, factor group-sparse regularizers are more efficient, accurate, and robust to the initial guess of rank. Experiments show promising performance of factor group-sparse regularization for low-rank matrix completion and robust principal component analysis.


HMLasso: Lasso for High Dimensional and Highly Missing Data

Takada, Masaaki, Fujisawa, Hironori, Nishikawa, Takeichiro

arXiv.org Machine Learning

Sparse regression such as Lasso has achieved great success in dealing with high dimensional data for several decades. However, there are few methods applicable to missing data, which often occurs in high dimensional data. Recently, CoCoLasso was proposed to deal with high dimensional missing data, but it still suffers from highly missing data. In this paper, we propose a novel Lasso-type regression technique for Highly Missing data, called `HMLasso'. We use the mean imputed covariance matrix, which is notorious in general due to its estimation bias for missing data. However, we effectively incorporate it into Lasso, by using a useful connection with the pairwise covariance matrix. The resulting optimization problem can be seen as a weighted modification of CoCoLasso with the missing ratios, and is quite effective for highly missing data. To the best of our knowledge, this is the first method that can efficiently deal with both high dimensional and highly missing data. We show that the proposed method is beneficial with regards to non-asymptotic properties of the covariance matrix. Numerical experiments show that the proposed method is highly advantageous in terms of estimation error and generalization error.


Beyond the One Step Greedy Approach in Reinforcement Learning

Efroni, Yonathan, Dalal, Gal, Scherrer, Bruno, Mannor, Shie

arXiv.org Machine Learning

The famous Policy Iteration algorithm alternates between policy improvement and policy evaluation. Implementations of this algorithm with several variants of the latter evaluation stage, e.g, $n$-step and trace-based returns, have been analyzed in previous works. However, the case of multiple-step lookahead policy improvement, despite the recent increase in empirical evidence of its strength, has to our knowledge not been carefully analyzed yet. In this work, we introduce the first such analysis. Namely, we formulate variants of multiple-step policy improvement, derive new algorithms using these definitions and prove their convergence. Moreover, we show that recent prominent Reinforcement Learning algorithms are, in fact, instances of our framework. We thus shed light on their empirical success and give a recipe for deriving new algorithms for future study.